Skip to content

C++: Support definition by reference in data flow library#1000

Merged
rdmarsh2 merged 8 commits into
github:rc/1.20from
jbj:dataflow-defbyref
Mar 1, 2019
Merged

C++: Support definition by reference in data flow library#1000
rdmarsh2 merged 8 commits into
github:rc/1.20from
jbj:dataflow-defbyref

Conversation

@jbj

@jbj jbj commented Feb 27, 2019

Copy link
Copy Markdown
Contributor

The IR-based data flow library won’t become the default in 1.20, so I've added support for definition by reference to the current one. This PR doesn't support flow out of callers whose body assigns to a pointer parameter; that will still have to wait for the IR. What's supported now is flow through known library functions and manual specification of flow sources as DefinitionByReferenceNodes. I've hooked data flow up to the semmle.code.cpp.models.interfaces.DataFlow library, which immediately gives us flow through memcpy and related functions.

We need to find out whether this interface with DefinitionByReferenceNode is going to be forward-compatible with whatever is coming in the IR. We can always deprecate it, but it should at least allow feature parity between the current library and the IR-based one.

Before this PR, data flow around definition by reference was different depending on whether a variable was modelled in FlowVar.qll as SsaVar or BlockVar. It's supposed to be an implementation detail that variables can be modelled in these two different ways. Essentially, SsaVar-modelled variables would stop their data flow propagation at f(&x), while BlockVar-modelled variables wouldn't. I've changed BlockVar to do what SsaVar does.

The implementation was a bit tricky because it added cases to FlowVar where a control-flow node is simultaneously a definition and a use, similar to what I did in https://git.semmle.com/Semmle/code/pull/23501.

@jbj jbj added the C++ label Feb 27, 2019
@jbj jbj requested a review from a team as a code owner February 27, 2019 15:10
@geoffw0

geoffw0 commented Feb 27, 2019

Copy link
Copy Markdown
Contributor

Looks promising.

jbj added 4 commits February 28, 2019 08:23
This accessor may not be forward-compatible with an IR-based version,
and it's unclear whether it has any use. The `VariableAccess` remains in
the `TDefinitionByReferenceNode` constructor since it's used to
implement `getType`.
This commits also adds a test that uses `getParameter`. The new tests
demonstrate that support for array-to-pointer decay works, but we get
data flow to the array rather than its contents.
@jbj jbj added this to the 1.20 milestone Feb 28, 2019
@aibaars aibaars changed the base branch from master to rc/1.20 March 1, 2019 17:55

@rdmarsh2 rdmarsh2 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants